Language-dependent Fusion for Language Identification
Abstract
This paper presents a novel fusion approach for Language Identification called Language-dependent Fusion (LDF). A fusion system is a hybrid system that combines the results of several individual sub-systems, which may use different features, models, and/or classifiers. Conventional approaches such as Linear Score Weighting (LSW) apply a single fixed weighting coefficient to each sub-system. In LDF, the weighting coefficients vary not only by sub-system but also by language. Furthermore, rather than being set by an experimental and statistical approach, the weighting coefficients are calculated from the performance on each language pair, which reflects the differences among languages. Experiments conducted on the OGI-92 multi-language database in a 10-language setting demonstrate a remarkable improvement over the individual sub-systems (45.46% error rate reduction) and over commonly used fusion techniques such as LSW (33.33% error rate reduction). Other advantages of LDF are also discussed.
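The contrast between the two weighting schemes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the formula that turns language-pair performance into weights, so `weights_from_pairwise_accuracy` uses an assumed rule (average pairwise accuracy per language, normalized across sub-systems), and all function names, array shapes, and numbers are hypothetical.

```python
import numpy as np


def fuse_lsw(scores, weights):
    """Linear Score Weighting: one fixed weight per sub-system.

    scores:  (K, L) array, score of each of K sub-systems for each of L languages
    weights: (K,) array, one coefficient per sub-system
    """
    return np.tensordot(weights, scores, axes=1)  # fused scores, shape (L,)


def fuse_ldf(scores, weights):
    """Language-dependent fusion: one weight per (sub-system, language).

    scores:  (K, L) array
    weights: (K, L) array
    """
    return (weights * scores).sum(axis=0)  # fused scores, shape (L,)


def weights_from_pairwise_accuracy(pair_acc):
    """ASSUMED rule, not taken from the paper.

    pair_acc[k, i, j] is the accuracy of sub-system k at separating
    language i from language j. Each language's weight is its mean
    accuracy against the other languages, normalized so that the
    weights for a language sum to 1 across sub-systems.
    """
    K, L, _ = pair_acc.shape
    off_diag = ~np.eye(L, dtype=bool)  # ignore the i == j entries
    per_lang = np.where(off_diag, pair_acc, 0.0).sum(axis=2) / (L - 1)
    return per_lang / per_lang.sum(axis=0, keepdims=True)


# Toy example: 2 sub-systems, 3 languages (made-up numbers).
scores = np.array([[0.6, 0.3, 0.1],
                   [0.2, 0.5, 0.3]])

lsw_fused = fuse_lsw(scores, np.array([0.5, 0.5]))

ldf_weights = np.array([[0.9, 0.2, 0.5],
                        [0.1, 0.8, 0.5]])
ldf_fused = fuse_ldf(scores, ldf_weights)

print("LSW decision:", int(np.argmax(lsw_fused)))
print("LDF decision:", int(np.argmax(ldf_fused)))
```

In this toy case LSW leaves the first two languages tied, while the per-language weights let the sub-system that is stronger on each language dominate that language's fused score.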
Similar resources
Constructing and Validating a Q-Matrix for Cognitive Diagnostic Analysis of a Reading Comprehension Test Battery
Of paramount importance in the study of cognitive diagnostic assessment (CDA) is the absence of tests developed for small-scale diagnostic purposes. Currently, much of the research carried out has been mainly on large-scale tests, e.g., TOEFL, MELAB, IELTS, etc. Even so, formative language assessment with a focus on informing instruction and engaging in identification of student’s strengths and...
Comparison of Spectral Methods for Spoken Language Identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. A statistical model of these features is then obtained for each language. The ...
Patrol Team Language Identification System for DARPA RATS P1 Evaluation
This paper describes the language identification (LID) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state of the art detection capabilities on audio from highly degraded communication channels. We show that techniques originally developed for LID on telephone speech (e.g., for the NIST langua...
Offline Language-free Writer Identification based on Speeded-up Robust Features
This article proposes offline language-free writer identification based on speeded-up robust features (SURF), which goes through training, enrollment, and identification stages. In all stages, an isotropic Box filter is first used to segment the handwritten text image into word regions (WRs). Then, the SURF descriptors (SUDs) of each word region and the corresponding scales and orientations (SOs) are extr...
Fusing language information from diverse data sources for phonotactic language recognition
The baseline approach in building phonotactic language recognition systems is to characterize each language by a single phonotactic model generated from all the available language-specific training data. When several data sources are available for a given target language, system performance can be improved using language source-dependent phonotactic models. In this case, the common practice is t...